Longest Common Subsequence Algorithm
The Longest Common Subsequence (LCS) algorithm is a dynamic programming technique used in computer science, bioinformatics, and data analysis to find the longest sequence of characters or elements that two or more sequences have in common. It is used to compare strings or sequences in applications such as DNA sequence alignment, file comparison, and natural language processing. The LCS problem can be defined as follows: given two sequences X and Y, find the longest subsequence that is common to both X and Y without altering the order of elements in the original sequences. A subsequence keeps the original order of elements but need not be contiguous: for example, "ace" is a subsequence of "abcde".
The LCS algorithm breaks the problem down into smaller subproblems and uses a tabular method to store the solutions of these subproblems in a bottom-up manner. For sequences of lengths n and m it constructs an (n+1) by (m+1) matrix in which cell (i, j) holds the length of the longest common subsequence of the first i elements of one sequence and the first j elements of the other. Row 0 and column 0 stand for an empty prefix and are filled with zeros. The algorithm then iterates through the remaining cells, comparing the i-th element of the first sequence with the j-th element of the second. If the elements match, the cell receives the value of the diagonal cell above and to the left plus 1, indicating that the common subsequence has been extended by one character. If the elements do not match, the cell receives the maximum of the cell above and the cell to the left. The bottom-right cell of the matrix then contains the length of the longest common subsequence. The subsequence itself can be reconstructed by backtracking from that cell: on a match, record the element and move diagonally; otherwise move to whichever neighbor (above or left) holds the larger value. Filling the matrix takes O(n*m) time.
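The rule described above can be written as a recurrence. The notation here is an assumption made for illustration: $x_i$ and $y_j$ denote the i-th and j-th elements of X and Y, and $c[i][j]$ is the table cell for the first i elements of X and the first j elements of Y.

```latex
c[i][j] =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0,\\
c[i-1][j-1] + 1 & \text{if } i, j > 0 \text{ and } x_i = y_j,\\
\max\bigl(c[i-1][j],\; c[i][j-1]\bigr) & \text{if } i, j > 0 \text{ and } x_i \neq y_j.
\end{cases}
```

With n and m the lengths of X and Y, the length of the longest common subsequence is $c[n][m]$.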
/*
Petar 'PetarV' Velickovic
Algorithm: Longest Common Subsequence
*/
#include <stdio.h>
#include <math.h>
#include <string.h>
#include <iostream>
#include <vector>
#include <list>
#include <string>
#include <algorithm>
#include <queue>
#include <stack>
#include <set>
#include <map>
#include <complex>
#define MAX_N 1001
using namespace std;
typedef long long lld;
int n, m;
string A, B;
int dp[MAX_N][MAX_N];
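// Computes the length of the longest common subsequence of two strings.
// Complexity: O(n*m)
inline int LCS()
{
    // dp[i][j] is the LCS length of the first i characters of A and the first j of B.
    // An empty prefix has nothing in common with anything, so row 0 and column 0 are 0.
    for (int i=0;i<=n;i++) dp[i][0] = 0;
    for (int j=0;j<=m;j++) dp[0][j] = 0;
    for (int i=1;i<=n;i++)
    {
        for (int j=1;j<=m;j++)
        {
            if (A[i-1] == B[j-1])
            {
                // Match: extend the LCS of the two shorter prefixes.
                dp[i][j] = dp[i-1][j-1] + 1;
            }
            else
            {
                // No match: keep the better result after dropping one character.
                dp[i][j] = max(dp[i][j-1], dp[i-1][j]);
            }
        }
    }
    return dp[n][m];
}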
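// Reconstructs one longest common subsequence by walking back through dp.
// Assumes LCS() has already been called to fill the table.
inline string getLCS()
{
    string ret;
    stack<char> S;
    int ii = n, jj = m;
    while (ii != 0 && jj != 0)
    {
        if (A[ii-1] == B[jj-1])
        {
            // Matching characters belong to the subsequence; move diagonally.
            S.push(A[ii-1]);
            ii--; jj--;
        }
        // Otherwise step toward the neighbour holding the larger value.
        else if (dp[ii-1][jj] > dp[ii][jj-1]) ii--;
        else jj--;
    }
    // Characters were collected back to front; popping the stack restores the order.
    while (!S.empty())
    {
        ret += S.top();
        S.pop();
    }
    return ret;
}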
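int main()
{
    A = "aleks";
    B = "abcdef";
    // Derive the lengths from the strings so they cannot fall out of sync.
    n = (int)A.size(), m = (int)B.size();
    printf("%d\n",LCS());            // prints 2
    printf("%s\n",getLCS().c_str()); // prints ae
    return 0;
}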
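
The listing above works on global variables and a fixed-size table. As a minimal sketch of the same technique without global state, the variant below takes the two strings as parameters and sizes the table from them. The name `lcsOf` and the use of `std::vector` and `std::reverse` are assumptions made for illustration and are not part of the original listing.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original listing): returns one longest
// common subsequence of a and b, deriving the lengths from the strings.
std::string lcsOf(const std::string& a, const std::string& b)
{
    const std::size_t n = a.size(), m = b.size();
    // dp[i][j] = LCS length of the first i characters of a and the first j of b.
    // Row 0 and column 0 stay 0 (empty prefix).
    std::vector<std::vector<int>> dp(n + 1, std::vector<int>(m + 1, 0));
    for (std::size_t i = 1; i <= n; i++)
    {
        for (std::size_t j = 1; j <= m; j++)
        {
            if (a[i-1] == b[j-1]) dp[i][j] = dp[i-1][j-1] + 1;
            else dp[i][j] = std::max(dp[i-1][j], dp[i][j-1]);
        }
    }
    // Walk back from the bottom-right cell, collecting matches in reverse order.
    std::string ret;
    std::size_t i = n, j = m;
    while (i != 0 && j != 0)
    {
        if (a[i-1] == b[j-1])
        {
            ret += a[i-1];
            i--; j--;
        }
        else if (dp[i-1][j] > dp[i][j-1]) i--;
        else j--;
    }
    std::reverse(ret.begin(), ret.end());
    return ret;
}
```

Under these assumptions, `lcsOf("aleks", "abcdef")` returns `"ae"`, matching the output of the program above. When several subsequences share the maximum length, only one of them is returned, and which one depends on the tie-breaking in the backtracking step.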